Skip to content

observability: phase 6.1 PR-C — 4 conformance fixtures#24

Merged
chris-colinsky merged 2 commits into
mainfrom
chore/phase-6-1-pr-c-conformance-fixtures
May 9, 2026
Merged

observability: phase 6.1 PR-C — 4 conformance fixtures#24
chris-colinsky merged 2 commits into
mainfrom
chore/phase-6-1-pr-c-conformance-fixtures

Conversation

@chris-colinsky
Copy link
Copy Markdown
Member

Summary

PR-C of Phase 6.1: drive four spec observability conformance
fixtures end-to-end through the existing
test_observability harness. Pure test-side work — no
engine, observer, or correlation changes.

Driven (this PR)

  • 002-otel-subgraph-hierarchy — §4.5 synthetic subgraph
    dispatch span parents under the invocation; inner-node spans
    parent under the dispatch span with the nested
    ["outer_sub", "inner_x"] / ["outer_sub", "inner_y"]
    namespace.
  • 003-otel-error-status — §4.2 ERROR status mapping for
    the node_exception case
    (status_description == "node_exception", recorded
    exception event, openarmature.error.category
    attribute, ok_node stays OK).
  • 007-otel-retry-attempt-spans — both sub-cases. Three
    sibling attempt spans with distinct attempt_index 0..2,
    single shared invocation parent.
    ERROR/ERROR/OK on retry-succeeds-third;
    ERROR×3 on retry-exhausts.
  • 011-otel-determinism — signature-tuple comparison
    (name, status_code, sorted attrs minus the canonical
    non-deterministic-by-design ignore set: invocation_id +
    auto-generated correlation_id) across two runs of the
    same graph.

Deferred to follow-up PRs

Each remaining _DEFERRED_FIXTURES entry now spells out the
specific gating-PR + spec/architectural rationale rather than
the generic "deferred to Phase 6.1" placeholder:

  • 004-otel-routing-error-attribution → PR-C.1 (gated on
    proposal 0012's spec amendment + v0.9.0 spec submodule
    bump). Routing-error attribution requires the §3/§6 ordering
    swap so RoutingError lands on the preceding node's
    completed event with error populated.
  • 006-otel-fan-out-instance-attribution → PR-C.2.
    Spec §5.4 mandates per-instance dispatch spans nested between
    the fan-out node span and the inner-node spans, plus three
    fan-out-config attributes the observer doesn't currently
    surface (item_count, concurrency, error_policy).
    Current observer opens one shared synthetic span at the
    namespace prefix instead.
  • 010-otel-log-correlation → PR-C.3.
    Needs synchronous observer prep hook (prepare_sync) so
    the engine task can attach the observer's span to OTel
    context for the duration of node-body execution. Observer
    span creation runs on the worker task today and isn't
    available synchronously after _dispatch_started.

PR-C.1 / PR-C.2 / PR-C.3 are mutually independent and land in
any order once their respective gates clear (PR-C.2 + PR-C.3
have no gate; PR-C.1 waits on the spec submodule bump).

Two minor YAML translations kept inline

Both fixture-specific so the shared adapter stays unchanged
(the adapter is used by other test suites; this PR doesn't
touch it):

  • Fixture 007 flaky.fail_count: N rewritten to the
    adapter's failure_sequence shape before build_graph.
  • Fixture 011 when: {field: counter, gt: 0} edge syntax
    rewritten to the adapter's
    condition: {if_field, equals, then, else} shape via
    _translate_011_when_edges (the deterministic input means
    counter == 1 always, so gt: 0 is functionally
    equivalent to equals: 1 for this fixture's flow).

Spec agent flagged the harness-DSL gap as a small non-blocking
follow-up — extend tests/conformance/adapter.py edge-condition
DSL with gt / gte / lt / lte / in / not_in
so future fixtures with arithmetic edge conditions don't need
per-driver translation. Tracked separately under harness backlog
in docs/phase-6-1-conformance-fillin.md.

Test plan

  • uv run pytest -q — 392 passed (was 388; net +4 from
    the new fixture drivers), 5 skipped (3 deferred
    conformance fixtures + 2 pre-existing)
  • uv run pyright — 0 errors
  • uv run ruff check . && uv run ruff format — clean
  • No engine/observer/correlation changes — diff is one
    file (tests/conformance/test_observability.py)

Drive four spec observability conformance fixtures end-to-end
through the existing test_observability harness:

- 002-otel-subgraph-hierarchy: §4.5 synthetic dispatch span
  parents under invocation; inner-node spans parent under
  dispatch span with the nested ["outer_sub", "inner_x"] /
  ["outer_sub", "inner_y"] namespace.
- 003-otel-error-status: §4.2 ERROR status mapping for the
  node_exception case (status_description == "node_exception",
  recorded exception event, openarmature.error.category
  attribute, ok_node stays OK).
- 007-otel-retry-attempt-spans: both sub-cases. Three sibling
  attempt spans with distinct attempt_index 0..2, single shared
  invocation parent. ERROR/ERROR/OK on retry-succeeds-third;
  ERROR x3 on retry-exhausts.
- 011-otel-determinism: signature-tuple comparison
  (name, status_code, sorted attrs minus the canonical
  non-deterministic-by-design ignore set: invocation_id +
  auto-generated correlation_id) across two runs.

Two minor YAML-shape translations kept inline in the
fixture-specific drivers so the existing adapter stays
unchanged (the adapter is shared with other suites; this PR
doesn't touch it):

- Fixture 007 flaky.fail_count=N rewritten to the adapter's
  failure_sequence shape before build_graph.
- Fixture 011 when:{field, gt: 0} edge syntax rewritten to the
  adapter's condition:{if_field, equals, then, else} shape via
  _translate_011_when_edges (gt:0 with the fixture's
  deterministic input is functionally equivalent to equals:1
  because counter==1 always under that input).

Map updates:

- _SUPPORTED_FIXTURES grows from 4 to 8.
- _DEFERRED_FIXTURES shrinks from 7 to 3 (004 / 006 / 010);
  each remaining note now spells out the specific gating-PR +
  spec/architectural rationale (PR-C.1 + proposal 0012 /
  PR-C.2 / PR-C.3) rather than the generic "deferred to
  Phase 6.1" placeholder.

Module docstring rewritten to distinguish the 8 driven
fixtures from the 3 deferred ones.

No engine, observer, or correlation changes. Pure test-side
work.

392 tests pass (was 388; net +4 from the new fixture drivers).
5 skipped (3 deferred conformance fixtures + 2 pre-existing).
Pyright clean.
Copilot AI review requested due to automatic review settings May 9, 2026 18:16
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends the tests/conformance/test_observability.py conformance harness to drive four additional OpenTelemetry observability fixtures end-to-end (002, 003, 007, 011) and updates the deferred-fixture notes to include more specific gating rationale.

Changes:

  • Add drivers and assertions for fixtures 002 (subgraph hierarchy), 003 (error status), 007 (retry attempt spans), and 011 (determinism).
  • Refine deferred-fixture skip messages with concrete gating requirements and follow-up PR pointers.
  • Add small fixture-specific YAML translations inside the test driver for fixtures 007 and 011.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/conformance/test_observability.py Outdated
Comment thread tests/conformance/test_observability.py Outdated
Comment thread tests/conformance/test_observability.py Outdated
- Fix invocation span status propagation. The OTelObserver was
  unconditionally calling set_status(OK) on the invocation span
  in _close_invocation_span; OTel doesn't auto-propagate child
  status to parents, so the spec §4.2 / fixture 003 contract
  ("invocation span ends ERROR when a child errored") wasn't
  satisfied. _handle_completed now sets ERROR on the invocation
  span when a child errors; _close_invocation_span no longer
  forces OK (clean invocations end UNSET, which exporters map
  to OK; ERROR is preserved by OTel SDK status precedence).
  Driver _run_fixture_003 gains the explicit invocation-status
  ERROR assertion + docstring fix.
- Fix _signature in _run_fixture_011 to include parent_name.
  Previously the docstring claimed "hierarchy" and the comment
  said "parent-by-name" but the signature ignored span.parent
  entirely. Now looks up parent.span_id in a per-run by-id map
  and includes the parent's name; a hierarchy regression where
  a node reparented to a different ancestor now surfaces as a
  signature divergence.
- Fix comment drift in _run_fixture_011_case: the translated
  edge condition is "equals: 1" (matching the deterministic
  counter == 1 input), not "equals: True".
@chris-colinsky chris-colinsky merged commit 8917cad into main May 9, 2026
5 checks passed
@chris-colinsky chris-colinsky deleted the chore/phase-6-1-pr-c-conformance-fixtures branch May 9, 2026 18:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants